Abstract

This paper presents our studies on the rearrangement of links in the structure
of websites for the purpose of improving the valuation of a page or group of pages
as established by a ranking function such as Google's PageRank. We build our
topological taxonomy starting from unidirectional and bidirectional rooted trees, and
up to more complex hierarchical structures such as cyclical rooted trees (obtained by
closing cycles on bidirectional trees) and PR-digraph rooted trees (digraphs whose
condensation digraph is a rooted tree, and which behave like cyclical rooted trees). We
give different modifications of the structure of these trees and their effect on the
valuation given by the PageRank function. We derive closed formulas for the
PageRank of the root of various types of trees, and establish a hierarchy of these
topologies in terms of PageRank.

Keywords: PageRank, world wide web topology, link structure.
1 Introduction

Google is still today's most popular search engine for the World Wide Web, and the
key to its success has been its PageRank algorithm [4], which ranks documents based
primarily on the link structure of the web. Simply put, PageRank considers a link from
a page H to another page J as a weighted vote from H in favour of the importance
of J, where the weight of the vote of H is itself determined by the number of links
(or voters) to H. Therefore, part of the game of electronic business today is to
find ways of lifting a page's link popularity, and specifically its PageRank, by either
obtaining the vote of a very important page (which is unlikely) or manufacturing a
large set of pages that would be "willing" to link to a client's page. For the latter
solution, known as link farms in the jargon of the Search Engine Optimisation (SEO)
community, much care must be taken since it is widely believed that Google has tuned
up its original PageRank algorithm to detect fictitious linking and similar forms of
spamming (e.g. the 2003 "Florida" update, see [9]).

At the heart of the challenge of improving a page's PageRank value is the role
played by the topology of the web. This is a widely recognised fact, as many SEO
analyses of link patterns can be found on the internet, together with tips on how
to rearrange these to raise the PageRank of specific pages. On the theoretical side,
having acknowledged that the World Wide Web should be treated as a directed graph,
there are various publications that propose different graph decompositions on regular
patterns as a way to improve PageRank computation (e.g. [2], [H]), and news that
suggests that technology newly acquired by Google, in the hope of enhancing PageRank,
is based on localisation of the computations on certain tree structures underlying the
Web ([?], [?]).

Motivated by these graph combinatoric challenges particular to the Web, we have
studied the PageRank formula from a mathematical perspective, and its relation with
the web site's topology, with the twofold goal of accelerating the computation of
PageRank and maximising its value for a specific page or set of pages. We summarise here
all our findings starting from (unidirectional) rooted trees and up to more complex
hierarchical structures. Ultimately, our academic goals are to disclose some of the graph
combinatorics underlying the World Wide Web and this popular ranking function, and
to contribute to the mathematical foundation of many heuristics and ad hoc rules
in use by the SEO community in its attempt to tweak the valuations assigned by
PageRank.

2 Some preliminaries on Graph Theory 

In this paper we will use some standard concepts and results about directed graphs, 
which we detail in this section in order to fix our notation. 

By a digraph $D$ we mean a pair $D = (V, A)$ where $V$ is a finite nonempty set
and $A \subseteq V \times V \setminus \{(v,v) : v \in V\}$. Elements in $V$ and $A$ are called vertices and
arcs respectively. For an arc $(u, v)$ we will say that $u$ is adjacent to $v$, and we may
sometimes also use $uv$ to denote an arc $(u,v)$. The order and the size of $D$ are,
respectively, $\mathrm{Card}(V)$ and $\mathrm{Card}(A)$. If $v$ is a vertex, the in-degree, $\mathrm{id}(v)$, of $v$ is the
number of arcs $(u,v)$ in $A$. Similarly, the out-degree, $\mathrm{od}(v)$, of $v$ is the number of
arcs $(v,u)$ in $A$.

A sequence of vertices $v_1 v_2 \ldots v_q$, $q \ge 2$, such that $(v_i, v_{i+1}) \in A$ for $i = 1, 2, \ldots, q-1$
is a walk of length $q-1$ joining $v_1$ with $v_q$, or more simply a $v_1$-$v_q$ walk. If the
vertices of $v_1 v_2 \ldots v_q$ are distinct the walk is called a path. A cycle of length $q$ or a
$q$-cycle is a path $v_1 v_2 \ldots v_q$ closed by the arc $v_q v_1$. A digraph is acyclic if it has no
cycle. By a semipath joining $v_1$ with $v_q$ we mean a sequence of distinct vertices
$v_1 v_2 \ldots v_q$, $q \ge 2$, such that $(v_i, v_{i+1}) \in A$ or $(v_{i+1}, v_i) \in A$ for $i = 1, 2, \ldots, q-1$.

A digraph is connected if for each pair $u$ and $v$ of distinct vertices, there is a
semipath joining $u$ with $v$. By a subdigraph of the digraph $(V, A)$ we mean a digraph
$(W, B)$ such that $W \subseteq V$ and $B \subseteq A$. The subdigraph is called a partial digraph
when $W = V$. The subdigraph induced by the digraph $(V, A)$ on $W \subseteq V$ is the
digraph $(W, A/W)$ where $A/W = A \cap (W \times W)$.

For an acyclic digraph there exists at least one vertex $v$ (resp. $u$) such that $\mathrm{od}(v) = 0$
(resp. $\mathrm{id}(u) = 0$). Such a vertex will be called maximal (resp. minimal) in the
digraph. Moreover, the vertices in an acyclic digraph $(V, A)$ can be distributed by
levels $N_0, N_1, \ldots$, where $N_0 = \{v \in V : v \text{ is maximal in } (V, A)\}$ and, recursively for
$p > 0$,

$$N_p = \Big\{ v \in V \setminus \bigcup_{i=0}^{p-1} N_i : v \text{ is maximal in the induced subdigraph on } V \setminus \bigcup_{i=0}^{p-1} N_i \Big\}.$$

Thus one has a partition of $V$, $V = N_0 \cup N_1 \cup \cdots \cup N_h$, $h$ being the height of the
digraph, i.e. the last index such that $N_h \ne \emptyset$.
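
The level decomposition just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `levels`, the representation of arcs as `(source, target)` pairs and the example tree are hypothetical and do not come from the paper.

```python
def levels(vertices, arcs):
    """Partition the vertices of an acyclic digraph into levels N_0, N_1, ...

    N_0 holds the maximal vertices (out-degree 0); N_p holds the vertices that
    become maximal once N_0, ..., N_{p-1} have been removed.
    """
    remaining = set(vertices)
    result = []
    while remaining:
        # A vertex is maximal in the induced subdigraph if none of its
        # out-neighbours is still present.
        maximal = {v for v in remaining
                   if not any(u == v and w in remaining for (u, w) in arcs)}
        if not maximal:
            raise ValueError("the digraph contains a cycle")
        result.append(sorted(maximal))
        remaining -= maximal
    return result


# Hypothetical example: a small tree whose arcs point towards the root "r".
arcs = [("a", "r"), ("b", "r"), ("c", "a"), ("d", "a"), ("e", "c")]
tree_levels = levels("rabcde", arcs)
height = len(tree_levels) - 1
```

For this example the root sits alone at level 0 and the height is the index of the last nonempty level.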

3 Short Introduction on PageRank 

The mathematical view of the World Wide Web is as a digraph $W = (V, A)$, where a
vertex represents any document posted on the web (a page), and an arc $(b, a)$ indicates
that there is a link from page $b$ to page $a$. In this setting, Brin and Page proposed in
[1] to evaluate each page in the Web with a positive real number, which they named
its PageRank, given by the formula (in its refined version from [5]):

$$P(a) = \frac{1-\alpha}{N} + \alpha \sum_{(b,a) \in A} \frac{P(b)}{\mathrm{od}(b)} \qquad (1)$$

where $P(a)$ is the PageRank of page $a$, $\mathrm{od}(b)$ is the number of links going out of page
$b$ (the out-degree of $b$), $\alpha$ is a constant that can take any real value in the interval
$(0, 1)$ (although Brin and Page always prefer to set it to 0.85), $N$ is the total number
of pages of the Web, and the sum is taken over all pages $b$ that have a link to $a$.
The motivation, given by the authors, is that formula (1) models the behaviour of a
random surfer of the Web who, being at a certain page $b$, either follows one of the links
shown in that page with probability $\alpha$, or jumps to any other page with probability
$1 - \alpha$, disregarding the contents of the pages. The probability of choosing a link in $b$
that takes him to page $a$ depends on the number $\mathrm{od}(b)$ of links out of $b$; so $P(b)/\mathrm{od}(b)$
is the contribution of $b$ to the PageRank of $a$, amortised by $\alpha$. In this setting, the
PageRank of $a$ is the probability of a user reaching page $a$ directly or after following
all appropriate links, and the sum of the PageRank of all the pages is 1, and so forms
a probability distribution over the Web (see [3] and [12]).
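
As a rough illustration of formula (1), the following Python sketch iterates the formula until it stabilises. It is a minimal sketch under our own assumptions: the function name `pagerank`, the arc representation and the three-page example are hypothetical, and no special treatment is given to pages without out-links (the formula is applied exactly as written).

```python
def pagerank(vertices, arcs, alpha=0.85, iterations=200):
    """Iterate formula (1): P(a) = (1 - alpha)/N + alpha * sum of P(b)/od(b)."""
    n = len(vertices)
    out_degree = {v: 0 for v in vertices}
    for (b, _a) in arcs:
        out_degree[b] += 1
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        new_rank = {v: (1.0 - alpha) / n for v in vertices}
        for (b, a) in arcs:
            # Page b passes an equal share of its rank along each out-link.
            new_rank[a] += alpha * rank[b] / out_degree[b]
        rank = new_rank
    return rank


# Hypothetical 3-page web forming a cycle x -> y -> z -> x.
cycle_rank = pagerank(["x", "y", "z"], [("x", "y"), ("y", "z"), ("z", "x")])
```

On the symmetric cycle every page receives the same value and the values add up to 1, in line with the probabilistic reading above.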

Yet another view of PageRank is the analytical formulation given by Brinkmeier
(see [7]), who conceived the PageRank function as a power series. In this setting, a
formula is given that highlights the fact that the ranking of a vertex $v$, as assigned by
PageRank, depends on the weighted contributions of each vertex in every walk that
leads into $v$, these contributions being higher in value for vertices that are nearer
to $v$.

For a given walk $p = v_1 v_2 \ldots v_n$ in the graph $(V, A)$, define the branching factor
of $p$ by the formula

$$D(p) = \frac{1}{\mathrm{od}(v_1)\,\mathrm{od}(v_2) \cdots \mathrm{od}(v_{n-1})}$$

Then, for any vertex $a \in V$, we have

$$P(a) = \frac{1-\alpha}{N} \sum_{w \in V} \sum_{p : w \to a} \alpha^{l(p)} D(p) \qquad (2)$$

where $p : w \to a$ denotes a walk $p$ starting at vertex $w$ and ending in vertex $a$, and
$l(p)$ is the length of this walk $p$.
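
Formula (2) can be evaluated literally by enumerating walks backwards from the target vertex. The sketch below is illustrative only: the function name, the truncation parameter `max_length` and the example tree are our assumptions, and we take the walk of length 0 at the target itself to contribute 1 to the sum.

```python
def walk_sum_pagerank(vertices, arcs, target, alpha=0.85, max_length=50):
    """Evaluate formula (2) over all walks ending at `target`, up to max_length arcs."""
    n = len(vertices)
    out_degree = {v: 0 for v in vertices}
    predecessors = {v: [] for v in vertices}
    for (b, a) in arcs:
        out_degree[b] += 1
        predecessors[a].append(b)

    total = 0.0
    # Each stack entry is (first vertex of the walk, length, alpha^length * D(p)).
    stack = [(target, 0, 1.0)]
    while stack:
        start, length, weight = stack.pop()
        total += weight
        if length < max_length:
            for b in predecessors[start]:
                # Prepending b multiplies the weight by alpha / od(b).
                stack.append((b, length + 1, weight * alpha / out_degree[b]))
    return (1.0 - alpha) / n * total


# Hypothetical tree with levels of sizes 1, 2, 2, 1 and arcs towards the root "r".
tree_vertices = ["r", "a", "b", "c", "d", "e"]
tree_arcs = [("a", "r"), ("b", "r"), ("c", "a"), ("d", "a"), ("e", "c")]
alpha = 0.85
root_value = walk_sum_pagerank(tree_vertices, tree_arcs, "r", alpha)
```

On an acyclic digraph the enumeration is finite; on digraphs with cycles the number of walks grows with `max_length`, so this direct approach is only practical for small examples.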



4 Ranking vertices on trees 

Our starting case study is the set of rooted trees, where a tree with root is an acyclic
digraph for which there exists a maximal vertex $r$ (the root), such that for every vertex
$v \ne r$ there is a unique $v$-$r$ path. We denote a tree with root $r$ as $T^r$. Thus, a tree
is a connected graph, its root $r$ is unique and all vertices distinct from $r$ have
out-degree 1, whilst the in-degree may vary. Vertices with in-degree 0 are called leaves.
The root is the targeted page for improving its PageRank valuation. The height of
a vertex in a rooted tree is the length of the path from the vertex to the root. The
level $N_k$ of a rooted tree is the set of vertices with height $k$; the root is at level $N_0$. The
height of a rooted tree is the length of the longest path from a leaf to the root.

Remark 4.1 Since we are interested in studying the behaviour of PageRank when
localised in certain subdigraphs of the Web digraph, we think, in particular, of our trees
as local closed web sites. This means that the value of $N$ in formula (1) is the number
of vertices in the tree. □



Our first result shows that to compute the PageRank of the root of a tree all we 
need to do is count the number of vertices at each level of the tree. 

Theorem 4.2 If a rooted tree has $N$ vertices and height $h$, then the PageRank of its
root $r$ is given by the formula

$$P(r) = \frac{1-\alpha}{N} \sum_{k=0}^{h} \alpha^k n_k \qquad (3)$$

where $n_k = |N_k|$ is the number of vertices of the $k$th level, $N_k$, of the tree.

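Proof: Below we use $b \in N_k : b \to a$ to indicate that vertex $b$ at level $N_k$ has a link to
$a$. Assume the first level of the tree is $N_1 = \{a_1, \ldots, a_{n_1}\}$. Then, according to equation
(1), and since every vertex other than the root has out-degree 1,

$$P(r) = \frac{1-\alpha}{N} + \alpha \sum_{i=1}^{n_1} P(a_i) = \frac{1-\alpha}{N} + \alpha \sum_{i=1}^{n_1} \Big( \frac{1-\alpha}{N} + \alpha \sum_{b \in N_2 : b \to a_i} P(b) \Big)$$

The index sets $\{b \in N_2 : b \to a_i\}$, for $i = 1, \ldots, n_1$, are pairwise disjoint; therefore,

$$P(r) = \frac{1-\alpha}{N} (1 + \alpha n_1) + \alpha^2 \sum_{b \in N_2} P(b)$$

Repeating the above manipulations on levels $N_2$, $N_3$, and up to level $N_{h-1}$, we have

$$P(r) = \frac{1-\alpha}{N} \sum_{k=0}^{h-1} \alpha^k n_k + \alpha^h \sum_{b \in N_h} P(b)$$

At the last level all vertices are leaves, which have no in-coming arcs, hence the
PageRank of any $b \in N_h$ is $(1-\alpha)/N$. Then

$$P(r) = \frac{1-\alpha}{N} \sum_{k=0}^{h-1} \alpha^k n_k + \alpha^h n_h \frac{1-\alpha}{N}$$

and the result follows. □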



Remark 4.3 Theorem 4.2 shows that we can do any rearrangements of links between
two consecutive levels of a web set up as a rooted tree, and the PageRank of the root
will be the same. □
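
The level-counting formula and the invariance under rearrangements can be checked on a toy example. The Python sketch below is only illustrative: the helper names and the two five-vertex trees are our own, and formula (1) is iterated exactly as written.

```python
def root_pagerank(level_sizes, alpha=0.85):
    """Formula (3): level_sizes[k] is n_k, with level_sizes[0] == 1 for the root."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


def iterate_formula_1(vertices, arcs, alpha=0.85, iterations=100):
    """Fixed-point iteration of formula (1) on a digraph given as (source, target) arcs."""
    n = len(vertices)
    out_degree = {v: sum(1 for (b, _) in arcs if b == v) for v in vertices}
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        rank = {a: (1.0 - alpha) / n
                   + alpha * sum(rank[b] / out_degree[b] for (b, t) in arcs if t == a)
                for a in vertices}
    return rank


# Two hypothetical trees with the same level sizes 1, 2, 2 but different links
# between levels 1 and 2.
vertices = ["r", "a", "b", "c", "d"]
tree_one = [("a", "r"), ("b", "r"), ("c", "a"), ("d", "a")]
tree_two = [("a", "r"), ("b", "r"), ("c", "a"), ("d", "b")]
rank_one = iterate_formula_1(vertices, tree_one)["r"]
rank_two = iterate_formula_1(vertices, tree_two)["r"]
expected = root_pagerank([1, 2, 2])
```

Both trees give the root the same value, which only depends on the level sizes.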



Remark 4.4 Due to Theorem 4.2 we will from now on describe a rooted tree $T^r$,
with root $r$ and $h > 0$ levels, each of cardinality $n_0 = 1, n_1, \ldots, n_h$, as the string
$T^r = 1 n_1 \ldots n_h$. Also the PageRank for the root $r$ of $T^r$, or for any other vertex
seen as the root of a subtree in $T^r$, will depend on the height and the number of
vertices at each level of $T^r$. Henceforth, we write the PageRank of $r$ in the tree $T^r$ as a
function of the height $h$, and denote it $P(h)$. □



For some regular topologies we can have nice closed formulas for their PageRank. 
Some examples follow below. 



4.1 m-ary trees 

For $m, h \ge 1$, let $T_m(h)$ be the full $m$-ary tree of height $h$, i.e. a tree of height
$h$ whose vertices, except for the leaves, have in-degree $m$. The 1-ary tree of height
$h$, $T_1(h)$, is a path of length $h$. For $m > 1$, $T_m(h)$ has $m^k$ vertices at each level
$k = 0, 1, \ldots, h$, and the total number of vertices is $(m^{h+1} - 1)/(m - 1)$. Using Theorem
4.2 we can quickly calculate the PageRank for the root $r$ (which depends on the height
$h$ and fixed arity $m$, and so we denote it $P_m(h)$). This is

$$P_m(h) = (1-\alpha) \frac{m-1}{m^{h+1}-1} \sum_{k=0}^{h} (m\alpha)^k = (1-\alpha) \left( \frac{m-1}{m^{h+1}-1} \right) \frac{(m\alpha)^{h+1} - 1}{m\alpha - 1}$$

and for the 1-ary tree

$$P_1(h) = \frac{1-\alpha}{h+1} \sum_{k=0}^{h} \alpha^k = \frac{1 - \alpha^{h+1}}{h+1}$$
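
As a sanity check, the closed formula for $m$-ary trees can be compared with the level sum of Theorem 4.2. The sketch below is illustrative; the function names are ours, and the closed form assumes $m\alpha \ne 1$ so that the geometric sum can be written as a quotient.

```python
def root_pagerank(level_sizes, alpha):
    """Formula (3) applied to a list of level sizes n_0 = 1, n_1, ..., n_h."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


def m_ary_closed_form(m, h, alpha):
    """Closed formula for the root of the full m-ary tree (assumes m * alpha != 1)."""
    if m == 1:
        return (1.0 - alpha ** (h + 1)) / (h + 1)
    geometric = ((m * alpha) ** (h + 1) - 1.0) / (m * alpha - 1.0)
    return (1.0 - alpha) * (m - 1) / (m ** (h + 1) - 1) * geometric


alpha = 0.85
# Largest gap between the closed form and the level sum over a few small trees.
worst_gap = max(
    abs(m_ary_closed_form(m, h, alpha)
        - root_pagerank([m ** k for k in range(h + 1)], alpha))
    for m in range(1, 5)
    for h in range(0, 9)
)
```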
4.2 Binomial trees

A binomial tree is at the core of fundamental data structures such as heaps, and hence
it qualifies as a good candidate for a website's topology.¹

We use $T_b(h)$ to denote the full binomial tree of height $h$. We recall from [H
§9.1] that $T_b(0)$ consists of only one vertex (the root) and, inductively, $T_b(h+1)$ is
two copies of $T_b(h)$ joined by an arc from the root of one of the $T_b(h)$ to the root of
the other. At each level $k = 0, 1, \ldots, h$, $T_b(h)$ has $\binom{h}{k}$ vertices, and the total number
of vertices in $T_b(h)$ is $\sum_{k=0}^{h} \binom{h}{k} = 2^h$. Using Theorem 4.2 we get a nice formula to
easily calculate the PageRank for the root $r$ of $T_b(h)$, namely

$$P_b(h) = \frac{1-\alpha}{2^h} \sum_{k=0}^{h} \binom{h}{k} \alpha^k = (1-\alpha) \left( \frac{1+\alpha}{2} \right)^h$$
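
The binomial-tree value $(1-\alpha)((1+\alpha)/2)^h$, which follows from the level sizes $\binom{h}{k}$ and the binomial theorem, can be checked against the level sum in the same way. The helper names below are our own.

```python
from math import comb


def root_pagerank(level_sizes, alpha):
    """Formula (3) applied to a list of level sizes n_0 = 1, n_1, ..., n_h."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


def binomial_closed_form(h, alpha):
    """Root PageRank of the binomial tree of height h: (1 - alpha) * ((1 + alpha)/2)^h."""
    return (1.0 - alpha) * ((1.0 + alpha) / 2.0) ** h


alpha = 0.85
binomial_gaps = [
    abs(binomial_closed_form(h, alpha)
        - root_pagerank([comb(h, k) for k in range(h + 1)], alpha))
    for h in range(0, 15)
]
```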



5 Rearrangements of vertices 

We begin our explorations on the possible modifications of the tree structure that will
improve the valuation of PageRank. Our first result shows that completely erasing the
vertices farthest away from the root improves the PageRank. This corroborates the
known fact that the optimal configuration is a star, i.e. a rooted tree of height 1 (see
e.g. [3], [?]).

Theorem 5.1 If in a tree $T^r = 1 n_1 \ldots n_h$ of height $h > 1$, the last level $N_h$ is
completely erased, then the PageRank of its root $r$, $P(h)$, increases its value.

Proof: After passing from the tree $T^r = 1 n_1 \ldots n_h$, with $N = 1 + n_1 + \ldots + n_h$ vertices,
to the tree $1 n_1 \ldots n_{h-1}$, with $N - n_h$ vertices, Theorem 4.2 gives (writing $n_0 = 1$)

$$P(h-1) - P(h) = \frac{(1-\alpha)\, n_h}{N (N - n_h)} \sum_{k=0}^{h-1} n_k (\alpha^k - \alpha^h) > 0$$

because $0 < \alpha < 1$ and $h > 1$. □

Remark 5.2 Thus, in order to improve the PageRank of the root of a tree one can
delete as many levels, from highest to lowest, as the context permits. Conversely, if a
new level of vertices is added to a tree, then the PageRank of its root decreases. □

If it were the case that, for practical or any other reasons, we were obliged to keep
a certain height, then a natural question is how much we can prune the tree to improve
its PageRank. The extreme situation is to prune all but one arc at each level, so we
take that structure as a benchmark and call it the queue tree.

¹ Goodness, as always, is understood in terms of PageRank.
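
The effect of erasing the last level can be observed numerically with formula (3). The sketch below is illustrative only; the example tree `1 3 4 2` and the chosen values of $\alpha$ are arbitrary assumptions of ours.

```python
def root_pagerank(level_sizes, alpha):
    """Formula (3) applied to a list of level sizes n_0 = 1, n_1, ..., n_h."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


tree = [1, 3, 4, 2]  # hypothetical tree 1 n_1 n_2 n_3
alphas = [0.1, 0.5, 0.85, 0.99]

# Gain obtained at the root when the last level is erased, for each alpha.
gains = [root_pagerank(tree[:-1], a) - root_pagerank(tree, a) for a in alphas]

# Erasing levels one after another, from the highest to the lowest.
chain = [root_pagerank(tree[:length], 0.85) for length in range(len(tree), 0, -1)]
```

Each deletion raises the value at the root, so `chain` is increasing, and adding a level back lowers it again.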
Definition 5.3 The queue tree of a tree $T^r = 1 n_1 \ldots n_h$ is the tree

$$T_q^r = 1\, n_1 \ldots n_{\lfloor (h-1)/2 \rfloor} \underbrace{1 \ldots 1}_{\lfloor h/2 \rfloor + 1}$$

Theorem 5.4 The PageRank of the root of a tree is smaller than the PageRank of the 
root of its queue tree. 

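Proof: We proceed recursively from the last level down to level $\lfloor (h+1)/2 \rfloor$. Below,
$1^{(j)}$ denotes $j$ consecutive levels with a single vertex each, and $n_0 = 1$.

(a) The PageRank $P(h)$ of the root $r$ of $T^r = 1 n_1 \ldots n_{h-1} n_h$ is smaller than the
PageRank $P'(h)$ of $T'^r = 1 n_1 \ldots n_{h-1} 1$. Indeed, let $N = 1 + n_1 + \ldots + n_h$, then

$$P'(h) - P(h) = \frac{1-\alpha}{N - (n_h - 1)} \Big( \sum_{k=0}^{h-1} n_k \alpha^k + \alpha^h \Big) - \frac{1-\alpha}{N} \sum_{k=0}^{h} n_k \alpha^k = \frac{(n_h - 1)(1-\alpha)}{(N - (n_h - 1)) N} \sum_{k=0}^{h-1} n_k (\alpha^k - \alpha^h) > 0$$

whenever $n_h > 1$.

Apply the same methodology for $T^r = 1 n_1 \ldots n_{h-2} n_{h-1} 1$ and $T'^r = 1 n_1 \ldots n_{h-2} 1 1$,
and so on, up to $\lfloor h/2 \rfloor$. At this last step we have

(b) $T^r = 1 n_1 \ldots n_{\lfloor (h-1)/2 \rfloor} n_{\lfloor (h+1)/2 \rfloor} 1^{(\lfloor h/2 \rfloor)}$ and we shall see that its PageRank is less than
that of the queue tree $T_q^r = 1 n_1 \ldots n_{\lfloor (h-1)/2 \rfloor} 1^{(\lfloor h/2 \rfloor + 1)}$. We work separately the cases of $h$
even or $h$ odd.

(b.i) If $h = 2p - 1$ then $T^r = 1 n_1 \ldots n_{p-1} n_p 1^{(p-1)}$, $T_q^r = 1 n_1 \ldots n_{p-1} 1^{(p)}$ and
$N = n_1 + \ldots + n_p + p$. Let $M = \dfrac{(1-\alpha)(n_p - 1)}{N (N - n_p + 1)}$. Then

$$P'(h) - P(h) = M \Big( 1 + \sum_{k=1}^{p-1} n_k \alpha^k - (N - n_p) \alpha^p + \sum_{k=p+1}^{2p-1} \alpha^k \Big)$$

$$= M \Big( (1 - \alpha^p) + \sum_{k=1}^{p-1} n_k (\alpha^k - \alpha^p) + \sum_{k=p+1}^{2p-1} (\alpha^k - \alpha^p) \Big) > 0$$

The positivity holds because every $n_k \ge 1$, so the bracket is at least
$\sum_{k=0}^{2p-1} \alpha^k - 2p\,\alpha^p$, and by the inequality between the arithmetic and geometric
means $\sum_{k=0}^{2p-1} \alpha^k \ge 2p\,\alpha^{(2p-1)/2} > 2p\,\alpha^p$.

(b.ii) If $h = 2p$ then $T^r = 1 n_1 \ldots n_{p-1} n_p 1^{(p)}$, $T_q^r = 1 n_1 \ldots n_{p-1} 1^{(p+1)}$ and
$N = n_1 + \ldots + n_p + p + 1$. One then shows $P'(h) - P(h) > 0$ by a similar argument as in
(b.i). □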
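
Definition 5.3 and Theorem 5.4 can be illustrated with a small Python sketch that builds the level string of a queue tree and compares root values. The function names and the sample trees are our own assumptions.

```python
def root_pagerank(level_sizes, alpha):
    """Formula (3) applied to a list of level sizes n_0 = 1, n_1, ..., n_h."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


def queue_tree(level_sizes):
    """Keep levels 0 .. floor((h-1)/2) and replace every higher level by one vertex."""
    h = len(level_sizes) - 1
    kept = (h - 1) // 2
    return level_sizes[:kept + 1] + [1] * (h - kept)


ternary = [3 ** k for k in range(5)]   # 1 3 9 27 81, height 4
binary = [2 ** k for k in range(4)]    # 1 2 4 8, height 3

# Gain at the root when a tree is replaced by its queue tree.
improvements = [
    root_pagerank(queue_tree(tree), a) - root_pagerank(tree, a)
    for tree in (ternary, binary)
    for a in (0.3, 0.6, 0.85)
]
```

For height 4 the three highest levels are reduced to single vertices, and for height 3 the two highest levels are, matching the count $\lfloor h/2 \rfloor + 1$ in the definition.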



Remark 5.5 Theorem 5.4 cannot be improved, in the sense that deleting further
vertices (but keeping the height) in a queue tree may or may not improve the PageRank
of the root. For small values of $h$, the queue tree is the optimal pruning of a tree for
increasing PageRank. For example, if $h = 4$ the corresponding queue tree is $T_q^r =
1 n_1 1 1 1$ with PageRank $P(h)$, and if $n_1 > 1$ and we remove a vertex from level $N_1$, we
get the tree $T'^r = 1 (n_1 - 1) 1 1 1$ with PageRank $P'(h)$, and their difference is

$$P(h) - P'(h) = \frac{(1-\alpha)(4\alpha - 1 - \alpha^2 - \alpha^3 - \alpha^4)}{(n_1 + 4)(n_1 + 3)} > 0$$

for any $\alpha$ such that $0.27568 < \alpha < 1$.

For larger values of $h$, an improvement of PageRank will depend on $\alpha$ and on the
cardinalities of the levels $N_1, \ldots, N_{\lfloor (h-1)/2 \rfloor}$. There are also some improvements that can
be done on queue trees of particular trees, such as $m$-ary and binomial. □
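
The threshold 0.27568 quoted in Remark 5.5 can be reproduced numerically. Writing out formula (3) for the trees $1\,n_1\,1\,1\,1$ and $1\,(n_1 - 1)\,1\,1\,1$, the sign of the difference reduces to the sign of $4\alpha - 1 - \alpha^2 - \alpha^3 - \alpha^4$; this reduction is our own algebra, and the bisection helper below is a hypothetical illustration rather than the authors' procedure.

```python
def root_pagerank(level_sizes, alpha):
    """Formula (3) applied to a list of level sizes n_0 = 1, n_1, ..., n_h."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


def sign_factor(alpha):
    """Factor deciding the sign of P(1 n_1 1 1 1) - P(1 (n_1 - 1) 1 1 1)."""
    return 4 * alpha - 1 - alpha ** 2 - alpha ** 3 - alpha ** 4


def bisect(f, low, high, tolerance=1e-12):
    """Plain bisection; assumes f(low) < 0 < f(high)."""
    while high - low > tolerance:
        middle = (low + high) / 2
        if f(middle) < 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2


threshold = bisect(sign_factor, 0.0, 0.5)

# Above the threshold the queue tree wins; below it, removing a vertex helps.
above = root_pagerank([1, 5, 1, 1, 1], 0.3) - root_pagerank([1, 4, 1, 1, 1], 0.3)
below = root_pagerank([1, 5, 1, 1, 1], 0.2) - root_pagerank([1, 4, 1, 1, 1], 0.2)
```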



6 Hierarchies of trees by height and size 

In what follows we assume that $\frac{1}{2} < \alpha < 1$, an interval of useful values for $\alpha$ in practice
(see the analysis on this subject in [12]). We want to order the $m$-ary and binomial
trees with respect to their PageRank. Which tree structure is best for PageRank? Our
first result on this theme gives a hierarchy with respect to the height.

Theorem 6.1 For values of the height $h$ sufficiently large, we have

$$P_1(h) > P_b(h) > P_2(h) > P_3(h) > \ldots > P_m(h) \ldots$$

Proof: We have to compute the appropriate limits:

1. For $1 < k < m$, $\displaystyle \lim_{h \to \infty} \frac{P_k(h)}{P_m(h)} = \frac{(k-1)(m\alpha-1)}{(m-1)(k\alpha-1)} > 1$, from where we conclude
that $P_k(h) > P_m(h)$.

2. $\displaystyle \frac{P_2(h)}{P_b(h)} = \frac{\alpha+1}{2(2\alpha-1)} \cdot \frac{(2\alpha)^{h+1}-1}{(\alpha+1)^{h+1}} \cdot \frac{2^{h+1}}{2^{h+1}-1} \to 0$ as $h \to \infty$.

3. $\displaystyle \frac{P_b(h)}{P_1(h)} = \frac{(1-\alpha)(h+1)\left(\frac{1+\alpha}{2}\right)^h}{1-\alpha^{h+1}} \to 0$ as $h \to \infty$.

Next we classify the PageRank of the $m$-ary and binomial trees with respect to
their order. (Beware that we will express the order in terms of the height, and therefore
we keep the notation $P_m(h)$ and $P_b(h)$ that stresses the dependency of PageRank on
the height of the tree.)
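
The ordering by height can be observed numerically from the closed formulas of Section 4. The sketch below is illustrative: the values $\alpha = 0.85$ and $h = 60$ are arbitrary choices of ours, and the function names are hypothetical.

```python
def m_ary(m, h, alpha):
    """Root PageRank of the full m-ary tree of height h (assumes m * alpha != 1)."""
    if m == 1:
        return (1.0 - alpha ** (h + 1)) / (h + 1)
    geometric = ((m * alpha) ** (h + 1) - 1.0) / (m * alpha - 1.0)
    return (1.0 - alpha) * (m - 1) / (m ** (h + 1) - 1) * geometric


def binomial(h, alpha):
    """Root PageRank of the binomial tree of height h."""
    return (1.0 - alpha) * ((1.0 + alpha) / 2.0) ** h


alpha, h = 0.85, 60
# Values listed in the order claimed for large heights: 1-ary, binomial, 2-ary, 3-ary, 4-ary.
by_height = [m_ary(1, h, alpha), binomial(h, alpha),
             m_ary(2, h, alpha), m_ary(3, h, alpha), m_ary(4, h, alpha)]

# Ratio of the 2-ary to the 3-ary value, to compare with limit 1 of the proof.
ratio_2_3 = m_ary(2, h, alpha) / m_ary(3, h, alpha)
limit_2_3 = (2 - 1) * (3 * alpha - 1) / ((3 - 1) * (2 * alpha - 1))
```

At this height the list is strictly decreasing, and the 2-ary to 3-ary ratio is already close to its limiting value.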

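Theorem 6.2 For sufficiently large order $N$ and $\alpha > 0.58$, we have

$$\ldots P_m(h) > \ldots > P_5(h) > P_b(h) > P_4(h) > P_3(h) > P_2(h) > P_1(h)$$

Proof: The proof splits into three cases.

(a) If $1 < k < m$ and $N \gg 0$ then $P_m(h) > P_k(h)$: A $k$-ary tree of height $h$
has $N = \frac{k^{h+1}-1}{k-1}$ many vertices. An $m$-ary tree of height $h'$ has the same number of
vertices as a $k$-ary tree of height $h$ if, and only if, $\frac{k^{h+1}-1}{k-1} = \frac{m^{h'+1}-1}{m-1}$, or equivalently
$h'+1 = \log_m \left( \frac{m-1}{k-1}(k^{h+1}-1) + 1 \right) \approx \log_m \left( \frac{m-1}{k-1} k^{h+1} \right)$, where the last relation indicates an
equivalence among infinitely large quantities. Then

$$\lim_{h \to \infty} \frac{P_k(h)}{P_m(h')} = \lim_{h \to \infty} \frac{(m\alpha-1)}{(k\alpha-1)} \cdot \frac{(k\alpha)^{h+1}-1}{(m\alpha)^{h'+1}-1} = \lim_{h \to \infty} \frac{(m\alpha-1)}{(k\alpha-1)} \cdot \frac{1}{(m\alpha)^{\log_m \frac{m-1}{k-1}}} \left( \frac{\alpha}{\alpha^{\log_m k}} \right)^{h+1} = 0$$

since $k < m$ implies $\log_m k < 1$, and in consequence $\alpha / \alpha^{\log_m k} < 1$.

(b) If $m \ge 2$ and $N \gg 0$ then $P_m(h) > P_1(h')$, where the 1-ary tree of height $h'$ has
$h'+1 = \frac{m^{h+1}-1}{m-1}$ vertices:

$$\frac{P_1(h')}{P_m(h)} = \frac{\dfrac{1-\alpha^{h'+1}}{h'+1}}{(1-\alpha) \dfrac{m-1}{m^{h+1}-1} \cdot \dfrac{(m\alpha)^{h+1}-1}{m\alpha-1}} = \frac{(m\alpha-1) \left( 1 - \alpha^{\frac{m^{h+1}-1}{m-1}} \right)}{(1-\alpha) \left( (m\alpha)^{h+1}-1 \right)} \to 0 \quad (h \to \infty)$$

(c) If $N \gg 0$ then $P_5(h) > P_b(h') > P_4(h)$: A binomial tree of height $h'$ has $2^{h'}$
vertices. For $m \ge 2$, an $m$-ary tree of height $h$ has the same cardinality as a binomial tree
of height $h'$ if, and only if, $\frac{m^{h+1}-1}{m-1} = 2^{h'}$, or equivalently,

$$h' = \log_2 \frac{m^{h+1}-1}{m-1}$$

We then show that

$$L = \lim_{h \to \infty} \frac{P_m(h)}{P_b(h')} = \lim_{h \to \infty} \frac{(m-1)^{\log_2(1+\alpha)}}{m\alpha-1} \left( \frac{m\alpha}{(1+\alpha)^{\log_2 m}} \right)^{h+1}$$

The limit $L$ is 0 or $\infty$ depending on $\dfrac{m\alpha}{(1+\alpha)^{\log_2 m}}$ being less than or greater than 1,
respectively. We showed $L = 0$ if $m \le 4$, or $m = 5$ and $\alpha < \alpha_0$, where $\alpha_0$ is the
irrational number solution of $5\alpha_0 = (1+\alpha_0)^{\log_2 5}$, namely, $\alpha_0 = 0.57016\ldots$. On the
other hand, $L = \infty$ if $m \ge 6$, or $m = 5$ and $\alpha > \alpha_0$. □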
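
The constant $\alpha_0 = 0.57016\ldots$ can be recovered by solving $5\alpha = (1+\alpha)^{\log_2 5}$ numerically. The sketch below uses plain bisection; the function names and the bracketing interval $[0.5, 0.9]$ are our own assumptions.

```python
import math


def exponent_gap(alpha, m):
    """m * alpha - (1 + alpha)^(log2 m); its sign decides whether L is 0 or infinity."""
    return m * alpha - (1.0 + alpha) ** math.log2(m)


def bisect(f, low, high, tolerance=1e-12):
    """Plain bisection; assumes f(low) < 0 < f(high)."""
    while high - low > tolerance:
        middle = (low + high) / 2
        if f(middle) < 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2


alpha_0 = bisect(lambda a: exponent_gap(a, 5), 0.5, 0.9)

# Sign of the gap for m = 4 and m = 6 on a grid of alpha values in [0.50, 0.99].
grid = [a / 100 for a in range(50, 100)]
always_negative_for_4 = all(exponent_gap(a, 4) < 0 for a in grid)
always_positive_for_6 = all(exponent_gap(a, 6) > 0 for a in grid)
```

For $m = 4$ the gap equals $-(1-\alpha)^2$, which explains why the 4-ary tree stays below the binomial tree for every $\alpha < 1$.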



6.1 Refining the hierarchy 

According to our results there are many non-uniform ways in which one can improve
the PageRank in our binomial or $m$-ary trees, yet keeping the height as a constraint
for maintaining a hierarchically organised website: it is sufficient to remove vertices
at farther distance from the root. However, there are also non-trivial uniform trees
with better PageRank than the binomial, with respect to height (Theorem 6.1). We
present one such possibility which can be seen as a basic backbone for a website with
different levels. Our intention with this example is to illustrate the following fact: to
optimise the PageRank values of certain pages of a web site is in general a hard task,
as there could be exponentially many possible rearrangements. We will come back to
this point in Section 7.

We define the path tree of height $h$, denoted $T_p(h)$, as a tree with a root from
which hang $h$ paths of $h, h-1, \ldots, 2$, and 1 vertices. Observe that $T_p(h)$ has $h$
vertices at level 1 (connected to the root), $h-1$ vertices at level 2, $h-2$ vertices at
level 3, ..., one vertex at level $h$. Hence, $|T_p(h)| = 1 + h(h+1)/2$ and if $r$ is the root
of $T_p(h)$, its PageRank, $P_p(h)$, is given by

$$P_p(h) = \frac{1-\alpha}{1 + h(h+1)/2} \left( 1 + h\alpha + (h-1)\alpha^2 + (h-2)\alpha^3 + \ldots + 2\alpha^{h-1} + \alpha^h \right)$$

$$= \frac{2(1-\alpha)}{2 + (h+1)h} \left( 1 + \alpha^{h+1} \sum_{k=1}^{h} k \left( \frac{1}{\alpha} \right)^k \right)$$

$$= \frac{2(1-\alpha)}{2 + (h+1)h} \left( 1 + \frac{\alpha^{h+2} + \alpha(1-\alpha)h - \alpha^2}{(1-\alpha)^2} \right)$$

We then show that

$$\frac{P_p(h)}{P_b(h)} = \frac{2}{2 + (h+1)h} \left( \frac{2}{1+\alpha} \right)^h \left( 1 + \frac{\alpha^{h+2} + \alpha(1-\alpha)h - \alpha^2}{(1-\alpha)^2} \right) \to \infty \quad (h \to \infty)$$

Thus $P_p(h) > P_b(h)$ for sufficiently large $h$, although we have checked the inequality
computationally for values of $h \ge 3$ (for $h < 3$ the binomial tree and the path tree are
the same).

On the other hand, one can show that $P_p(h)/P_1(h) \to 2\alpha$ as $h \to \infty$, hence the relative
position of $P_p(h)$ and $P_1(h)$ in the hierarchy depends on whether $\alpha$ is $> 1/2$ or $< 1/2$.
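
The path tree can be explored numerically with formula (3). In this illustrative sketch the function names, the value $\alpha = 0.85$ and the heights tested are our own choices.

```python
def root_pagerank(level_sizes, alpha):
    """Formula (3) applied to a list of level sizes n_0 = 1, n_1, ..., n_h."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


def path_tree_levels(h):
    """Level sizes of the path tree: h vertices at level 1, h - 1 at level 2, ..."""
    return [1] + [h + 1 - k for k in range(1, h + 1)]


def path_tree_closed_form(h, alpha):
    """Closed form obtained by summing (h + 1 - k) * alpha^k for k = 1 .. h."""
    s = (alpha ** (h + 2) + alpha * (1 - alpha) * h - alpha ** 2) / (1 - alpha) ** 2
    return 2 * (1 - alpha) / (2 + (h + 1) * h) * (1 + s)


def binomial(h, alpha):
    return (1.0 - alpha) * ((1.0 + alpha) / 2.0) ** h


def one_ary(h, alpha):
    return (1.0 - alpha ** (h + 1)) / (h + 1)


alpha = 0.85
closed_form_gap = max(
    abs(path_tree_closed_form(h, alpha) - root_pagerank(path_tree_levels(h), alpha))
    for h in range(1, 21)
)
beats_binomial = all(path_tree_closed_form(h, alpha) > binomial(h, alpha)
                     for h in range(3, 41))
ratio_to_path = path_tree_closed_form(5000, alpha) / one_ary(5000, alpha)
```

For large heights the ratio to the 1-ary tree approaches $2\alpha$, which is why $\alpha = 1/2$ separates the two regimes.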

6.2 Hierarchies for queue trees 

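Surprisingly, for the queue trees of $m$-ary and binomial trees, the same ordering of their
PageRank with respect to height and order holds. For the $m$-ary (resp. binomial) tree
of height $h$, we use $P_{q,m}(h)$ (resp. $P_{q,b}(h)$) to denote the PageRank of (the root
of) its queue tree. These values will depend on the parity of the height $h$, since
by Definition 5.3 if $h = 2p - 1$ then $T_q^r = 1 n_1 \ldots n_{p-1} 1^{(p)}$, and if $h = 2p$ then
$T_q^r = 1 n_1 \ldots n_{p-1} 1^{(p+1)}$, where $1^{(j)}$ denotes $j$ consecutive levels with a single vertex each.
Thus, for $m > 1$,

$$P_{q,m}(2p-1) = \frac{1-\alpha}{\frac{m^p-1}{m-1} + p} \left( \frac{(m\alpha)^p - 1}{m\alpha - 1} + \alpha^p \frac{\alpha^p - 1}{\alpha - 1} \right)$$

$$P_{q,m}(2p) = \frac{1-\alpha}{\frac{m^p-1}{m-1} + p + 1} \left( \frac{(m\alpha)^p - 1}{m\alpha - 1} + \alpha^p \frac{\alpha^{p+1} - 1}{\alpha - 1} \right)$$

(the case $m = 1$ is trivial since the 1-ary queue tree coincides with the 1-ary tree).
And,

$$P_{q,b}(2p) = \frac{1-\alpha}{\sum_{k=0}^{p-1} \binom{2p}{k} + p + 1} \left( \sum_{k=0}^{p-1} \binom{2p}{k} \alpha^k + \alpha^p \frac{\alpha^{p+1} - 1}{\alpha - 1} \right)$$

with the analogous expression for $P_{q,b}(2p-1)$.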
Theorem 6.3 (i) For sufficiently large height $h$, we have

$$P_{q,1}(h) > P_{q,b}(h) > P_{q,2}(h) > P_{q,3}(h) > \ldots > P_{q,m}(h) \ldots$$

(ii) For sufficiently large order $N$ and $\alpha > 0.58$, we have

$$\ldots P_{q,m}(h) > \ldots > P_{q,5}(h) > P_{q,b}(h) > P_{q,4}(h) > P_{q,3}(h) > P_{q,2}(h) > P_{q,1}(h)$$

The proofs of (i) and (ii) are more involved and longer than those of the previous theorems. We
shall give sufficient pointers so that readers may reproduce them.

Part (i): We need to study separately the cases of $h$ being even or odd. The crucial
observation is:

Lemma 6.4 If $0 < \alpha < 1$ then

$$(1+\alpha)^{2p-2} < \sum_{k=0}^{p-1} \binom{2p-1}{k} \alpha^k + \alpha^p \frac{\alpha^p - 1}{\alpha - 1} < (1+\alpha)^{2p-1}$$

(To show this observe that for $0 < \alpha < 1$ the quotient of $(1+\alpha)^{2p-1}$ and
$\sum_{k=0}^{p-1} \binom{2p-1}{k} \alpha^k + \alpha^p \frac{\alpha^p - 1}{\alpha - 1}$ tends to a constant $L$, with $1 < L < 1 + \alpha$, as $p$ grows.)

Using this lemma, we obtain the same limits 1), 2) and 3) in Theorem 6.1 for $h = 2p$
and for $h = 2p - 1$.

Part (ii): As in (i) we need to study separately the cases of $h$ being even or odd.
Observe that for the same type $\tau$ of queue tree ($\tau$ being $m$-ary, binomial, etc.), by
Theorem 5.1 the queue tree of height $2p - 1$ is obtained by deleting the level $N_{2p}$ of
the queue tree of height $2p$, and hence,

$$P_{q,\tau}(2p-1) > P_{q,\tau}(2p)$$

Therefore, for any two queue trees of types $\tau_1$ and $\tau_2$ we have

$$\frac{P_{q,\tau_1}(2p-1)}{P_{q,\tau_2}(2p)} > \max \left\{ \frac{P_{q,\tau_1}(2p)}{P_{q,\tau_2}(2p)}, \frac{P_{q,\tau_1}(2p-1)}{P_{q,\tau_2}(2p-1)} \right\} > \frac{P_{q,\tau_1}(2p)}{P_{q,\tau_2}(2p-1)}$$

and since all these quotients are positive, if any of them tends to zero then all
those that are to the right hand side (the ones that are smaller) will also tend
to zero. Hence, to prove that $P_{q,\tau_1}(h) < P_{q,\tau_2}(h)$, it will be enough to prove that

$$\frac{P_{q,\tau_1}(2p-1)}{P_{q,\tau_2}(2p)} \to 0 \quad (p \to \infty)$$

Now to obtain the inequalities claimed in (ii) follow the scheme of Theorem 6.2:

(a) If $1 < k < m$ then show that the quotient of $P_{q,k}$ by $P_{q,m}$, taken over queue trees of
the same order, tends to 0.

(b) If $m > 1$ then show $\dfrac{P_{q,1}(h)}{P_{q,m}(2p-1)} \to 0$ as $p \to \infty$, where $h = |T_{q,m}(2p-1)|$ (so this $h$
depends on $p$).

(c) To show $P_{q,5}(h) > P_{q,b}(h) > P_{q,4}(h)$, due to the observation that $P_{q,\tau}(2p-1) >
P_{q,\tau}(2p)$, it will be enough to show that $P_{q,5}(2p) > P_{q,b}(2p-1)$ and $P_{q,b}(2p) >
P_{q,4}(2p-1)$. These two inequalities are shown by taking analogous limits as in
the proof of part (c) in Theorem 6.2 and applying the same constraints about $\alpha$.
7 The problem of optimising the link structure

As mentioned in the introduction, a theoretically as well as commercially important
problem is to find a scheme for modifying the link structure of a local web in order
to improve its ranking, as set by PageRank or any other ranking function. In this
paper we have presented the most fundamental goal of designing a local web (or fixing
an already existing one) with a treelike structure, where the PageRank of the main
page, located at the root of the tree, should have the highest possible value, but at the
same time the overall structure of the web should satisfy certain conditions given by
the context. We shall not make precise the details of the context; they are the general
conditions imposed by design. Let us refer to the context as $\Pi$. By virtue of Theorem
4.2 this translates into the following optimisation problem.
Main Objective: Given a certain context $\Pi$, to maximise the function

$$P(r) = \frac{1-\alpha}{1 + n_1 + \ldots + n_h} \left( 1 + n_1 \alpha + n_2 \alpha^2 + \ldots + n_h \alpha^h \right)$$

for fixed $\alpha$, such that $0 < \alpha < 1$, and all trees $T^r = 1 n_1 \ldots n_h$ with integer values
$h, n_i \ge 1$, $1 \le i \le h$. If the total number of vertices is bounded then we can assure
that the maximum exists. The complexity of the problem depends mostly on the
conditions imposed by the context $\Pi$. This justifies approaching the solution through
heuristics. Here we give an ad hoc list of rules that clearly stem from our theorems.
Rule 1: Due to Theorem 5.1 the first action to take is to reduce the height as much
as the context allows.

Rule 2: Keep in mind that while applying Rule 1 (and deleting levels), links between
consecutive levels can be rearranged in any way you like, as long as the context is kept
consistent, and this has no effect on the root's PageRank value (by Theorem 4.2).
Rule 3: Once the optimal height $h \ge 1$ is attained², we delete (as much as possible)
vertices from levels in the upper half of the tree, trying to get it close to its underlying
queue tree (Theorem 5.4), and those vertices that cannot be deleted should be moved
as close to level 1 as possible (by Theorem 4.2).
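
To make the optimisation problem concrete, the following Python sketch searches exhaustively over level strings for one very simple, hypothetical context: the height and the total number of vertices are both fixed. The function names and this choice of context are our own assumptions; realistic contexts would impose other conditions.

```python
from itertools import product


def root_pagerank(level_sizes, alpha=0.85):
    """Formula (3) applied to a list of level sizes n_0 = 1, n_1, ..., n_h."""
    n = sum(level_sizes)
    return (1.0 - alpha) / n * sum(n_k * alpha ** k for k, n_k in enumerate(level_sizes))


def best_tree(height, order, alpha=0.85):
    """Search all strings 1 n_1 ... n_h with every n_i >= 1 and the given order."""
    best = None
    for sizes in product(range(1, order), repeat=height):
        if 1 + sum(sizes) != order:
            continue
        candidate = [1, *sizes]
        if best is None or root_pagerank(candidate, alpha) > root_pagerank(best, alpha):
            best = candidate
    return best


# Hypothetical context: height 3 and exactly 8 pages.
best = best_tree(height=3, order=8)
```

Under this context the search places every vertex that is not needed to keep the height at level 1, in agreement with Rule 3. The search space grows exponentially with the height, which is one reason for relying on heuristics.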

The above rules of general nature can be complemented by next working on the
particularities of the queue tree structure. For example, if the applications of rules 1
to 3 give as a final result a 5-ary queue tree, then Theorem 6.3-(i) tells us that pruning
more vertices to convert this tree into a binary queue tree, or binomial queue tree (of
the same height) improves PageRank. The caveat is that we have proved Theorem 6.3
using continuous calculus and, therefore, it cannot be applied without doubt for small
values of the height. To remedy this deficiency, we have computed $P_{q,b}(h)$ and $P_{q,m}(h)$
for various $m$ and many small integer values of $h$, and concluded the following facts,
which strengthen Theorem 6.3-(i):

² Optimality here again depends on maintaining the context consistent. This height could mean the
minimal levels of a hierarchy that we need to reflect in the web site; say, for example, of a corporation
or a hypertext.

1. $P_{q,1}(h) > P_{q,b}(h) > P_{q,2}(h)$, for $h \ge 17$.

2. $P_{q,b}(h) > P_{q,1}(h)$, for $2 < h \le 16$.

3. $P_{q,b}(h) > P_{q,2}(h)$, for all $h > 2$.

4. $P_{q,1}(h) > P_{q,2}(h)$, for $h \ge 15$.

5. $P_{q,2}(h) > P_{q,1}(h)$, for $2 < h \le 14$.

6. $P_{q,2}(h) > P_{q,3}(h) > P_{q,4}(h) > \ldots > P_{q,m}(h) \ldots$, for $h \ge 9$.

7. $P_{q,2}(h) < P_{q,3}(h) < P_{q,4}(h) < \ldots < P_{q,m}(h) \ldots$, for $h = 3, 4$.

8. Theorem 6.3-(i) is "almost" true for $h = 5, 6, 7, 8$ (all but except some arity $m$
from 2 to 6).

Now, depending on the value of the height of the queue tree obtained by rules 1 to 3,
we use the appropriate inequality from the above list to guide our pruning correctly
and raise the root's PageRank. For example, if we had arrived at an $m$-ary queue tree
of height 17, and $m > 2$, we can delete and move vertices, shaping the tree like a $k$-ary
queue tree, for some $k < m$, or like a binomial tree.

8 The bidirectional case 

We turn now to trees with bidirectional as well as unidirectional arcs. We use $B^r$ to
denote a tree rooted at $r$ with both unidirectional and bidirectional arcs, where all
unidirectional arcs point towards the root $r$. Formally, a digraph $B^r = (V, A)$ is a
bidirectional tree with root $r$ if its set of arcs $A$ can be partitioned in two disjoint
sets $A_1$ and $A_2$ such that:

• $(V, A_1)$ is a partial tree with root $r$ (the underlying tree of $B^r$), and

• if $uv \in A_2$ then $vu \in A_1$, and in this case we say that $v$ is the origin of the
bidirectional arc $vuv$. (Intuitively think of a bidirectional arc as a 2-cycle.)

Observe that for each arc $uv \in A_2$ the corresponding bidirectional arc $vuv$ defines
an infinite number of walks ending at the root $r$ (just as any cycle within a
tree would do). Hence, for the purpose of computing the PageRank of $r$ with equation (2), we
can view each arc $uv \in A_2$ as a path of infinite length hanging from the vertex $v$, and
containing alternately copies of vertices $u$ and $v$, where at each $v$ hangs a copy of the
tree rooted at $v$, $T^v$, and at each $u$ hangs a copy of the remainder of the tree rooted at
$u$ after removing from it the sub-tree $T^v$, that is, $T^u \setminus T^v$. Note that $T^u$ (and $T^v$) may
contain bidirectional arcs. Extending this idea through all bidirectional arcs, we can
view the bidirectional tree $B^r$ as an infinite tree. Figure 1 shows a bidirectional tree $B^r$
with two bidirectional arcs, $vuv$ and $v'u'v'$ (leftmost tree); next to it the bidirectional
tree with an infinite branch corresponding to $vuv$; and the rightmost tree is the full
infinite tree associated to $B^r$.

Figure 1: Bidirectional tree $B^r$ and its infinite associated tree in two stages.

This view of $B^r$ as an infinite tree makes it easier to understand the interpretations
we do below of equation (2) adapted to our trees. To be clear, what we mean by the
infinite tree associated to $B^r$ is the tree rooted at $r$, which contains the underlying tree
previously defined (i.e. the partial tree rooted at $r$, $(V, A_1)$), and such that for each
vertex $v \ne r$ that is the origin of a bidirectional arc $vuv$ in $B^r$, we substitute the arc $uv$
by a countably infinite path rooted at $u$, containing alternately copies of vertices $u$
and $v$, where at each $v$ hangs a copy of the tree rooted at $v$, $T^v$, and at each $u$ hangs a
copy of the remainder of the tree rooted at $u$ after removing from it the sub-tree $T^v$,
that is $T^u \setminus T^v$.

Now, let us recall equation (2), $P(a) = \frac{1-\alpha}{N} \sum_{v \in V} \sum_{p : v \to a} \alpha^{l(p)} D(p)$. In it, the sum
is taken over all vertices $v$ connected through a walk to $a$. In an associated infinite
tree this walk is a unique path $p$ connecting $v$ with $a$. This path could have various
incidences of bidirectional arcs. On the other hand, each bidirectional arc $uvu$, with
$u \ne r$ and $\mathrm{od}(u) = 2$, produces an infinite number of walks: $u, uvu, uvuvu, \ldots$, with
branching factors $D(u) = 1$, $D(uvu) = 1/2$, $D(uvuvu) = 1/2^2, \ldots$; hence, summing
over all these walks we get

$$\sum_{k \ge 0} \frac{\alpha^{2k}}{2^k} = \frac{1}{1 - \alpha^2/2}$$

Therefore, if the path $p : v \to a$ contains $q$ vertices, each meeting a bidirectional
arc, the contribution to $P(a)$ of the possible walks produced on $p$ is $\dfrac{\alpha^{l(p)} D(p)}{(1 - \alpha^2/2)^q}$.
If the bidirectional arc is $rvr$, with $\mathrm{od}(r) = 1$, and hence $D(rvr \ldots vr) = 1$ for any
walk on this arc, we get that the contribution to $P(a)$ is

$$\frac{1}{1 - \alpha^2}$$

All the above observations lead to the following result on computing the PageRank
on bidirectional trees.

Theorem 8.1 Let $B^r = (V, A)$ be a bidirectional tree rooted at $r$.

(1) If $\mathrm{od}(r) = 0$, then for all $a \in V$,

$$P(a) = \frac{1-\alpha}{N} \sum_{p : v \to a} \frac{\alpha^{l(p)}}{2^n (1 - \alpha^2/2)^q} \qquad (4)$$

(2) If $\mathrm{od}(r) = 1$ with bidirectional arc $rur$, then

$$P(a) = \frac{1-\alpha}{N} \sum_{p : v \to a} \frac{\alpha^{l(p)}}{2^n (1 - \alpha^2/2)^q}, \quad \text{for } a \notin \{r, u\} \qquad (5)$$

and

$$P(a) = \frac{1-\alpha}{N} \sum_{p : v \to a} \frac{\alpha^{l(p)}}{2^n (1 - \alpha^2/2)^{q-1} (1 - \alpha^2)}, \quad \text{for } a \in \{r, u\} \qquad (6)$$

where in all cases, $p : v \to a$ is the unique path from $v$ to $a$, and $l(p)$ is the length of
this path; $n$ is the number of bidirectional vertices (i.e. with $\mathrm{od}(u) = 2$) not being an
end-vertex in $p$; $q$ is the number of bidirectional arcs meeting $p$. □



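In particular, if $\mathrm{od}(r) = 0$,

$$P(r) = \frac{1-\alpha}{N} \sum_{p : v \to r} \frac{\alpha^{l(p)}}{(2 - \alpha^2)^q}$$

since $n = q$ for this case. And if $\mathrm{od}(r) = 1$,

$$P(r) = \frac{1-\alpha}{N} \sum_{p : v \to r} \frac{\alpha^{l(p)}}{(2 - \alpha^2)^{q-1} (1 - \alpha^2)}$$

since $n = q - 1$ for this case.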



At this point we would like to make a digression into the nature of the formulas for
PageRank we have just deduced. These have their origin in Brinkmeier's equation (eq.
(2)), which in essence computes the contributions of vertices to the value $P(a)$ by a
depth-first search exploration. Our proposed equation for computing the PageRank of
the root in the case of unidirectional trees (Section 4, equation (3)) is founded on the
complementary tree-search routine, namely, breadth-first search; and we would like to
have a result in the same spirit of counting by levels for the case of bidirectional trees.

For a breadth-first search type of computation of PageRank on a bidirectional tree,
we must classify the vertices by levels of the tree. For each $k \geq 0$, the vertices
at level $N_k = \{v_{k1}, \ldots, v_{kn_k}\}$ are characterised by the number of bidirectional arcs met
by their paths which end in the root, $v_{ki} \cdots r$. Hence, $n_k = n_k^0 + \cdots + n_k^{k+1}$, where
$n_k^q$ denotes the number of vertices at level $N_k$ having $q$ bidirectional arcs meeting
their paths to $r$. Some of these could be null. The $n_k^q$ non-null many vertices
contribute to the summation in the two preceding formulas for $\mathcal{P}(r)$ the quantities

$$\frac{n_k^q \alpha^k}{(2-\alpha^2)^q} \quad \text{and} \quad \frac{n_k^q \alpha^k}{(2-\alpha^2)^{q-1} (1-\alpha^2)},$$

according to either case of $od(r) = 0$ or $od(r) = 1$. Thus, we have
the following result.

Theorem 8.2 Let $B^r$ be a bidirectional tree rooted at $r$, with $N$ vertices and height
$h > 0$.

(1) If $od(r) = 0$,

$$\mathcal{P}(r) = \frac{1-\alpha}{N} \sum_{k=0}^{h} \sum_{q=0}^{k} \frac{n_k^q \alpha^k}{(2-\alpha^2)^q}.$$

(2) If $od(r) = 1$,

$$\mathcal{P}(r) = \frac{1-\alpha}{N} \sum_{k=0}^{h} \sum_{q=0}^{k} \frac{n_k^{q+1} \alpha^k}{(2-\alpha^2)^q (1-\alpha^2)},$$

where $q$ is the number of bidirectional arcs met by the path ending in $r$, but distinct
from the bidirectional arc incident with $r$, if such a bidirectional arc exists. □

We can give a more succinct vectorial formulation of the previous result, if we
develop the sums "by rows" (outermost sum) and group column terms in a vector.

Theorem 8.3 Let $B^r$ be a bidirectional tree rooted at $r$ with $N$ vertices and height $h > 0$.
If $od(r) = 0$, then

$$\mathcal{P}(r) = \frac{1-\alpha}{N} \sum_{q=0}^{h} \frac{A_q \cdot \Lambda_q}{(2-\alpha^2)^q},$$

where $A_q = (n_q^q, n_{q+1}^q, \ldots, n_h^q)$ and $\Lambda_q = (\alpha^q, \alpha^{q+1}, \ldots, \alpha^h)$.
Similarly, if $od(r) = 1$, then

$$\mathcal{P}(r) = \frac{1-\alpha}{N} \sum_{q=0}^{h} \frac{A'_q \cdot \Lambda_q}{(2-\alpha^2)^q (1-\alpha^2)},$$

where $A'_q = (n_q^{q+1}, n_{q+1}^{q+1}, \ldots, n_h^{q+1})$. □
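The level-counting formula can be compared against a direct computation on a small tree. The sketch below assumes the PageRank convention $\mathcal{P}(x) = \frac{1-\alpha}{N} + \alpha \sum_{y \to x} \mathcal{P}(y)/od(y)$, with no redistribution from vertices of out-degree 0, which is the reading consistent with the walk-sum equation; the helper names `pagerank`, `root_by_levels` and `root_by_vectors` and the five-vertex example are hypothetical.

```python
def pagerank(arcs, n, alpha=0.85, iters=500):
    """Fixed-point iteration of P(x) = (1 - alpha)/n + alpha * sum_{y -> x} P(y)/od(y)."""
    out = [[] for _ in range(n)]
    for u, v in arcs:
        out[u].append(v)
    base = (1 - alpha) / n
    rank = [base] * n
    for _ in range(iters):
        nxt = [base] * n
        for u in range(n):
            for v in out[u]:
                nxt[v] += alpha * rank[u] / len(out[u])
        rank = nxt
    return rank


def root_by_levels(counts, n, alpha=0.85):
    """Double sum over levels k and arc counts q, case od(r) = 0; counts[(k, q)] = n_k^q."""
    total = sum(m * alpha**k / (2 - alpha**2) ** q for (k, q), m in counts.items())
    return (1 - alpha) / n * total


def root_by_vectors(counts, n, height, alpha=0.85):
    """Vector form, case od(r) = 0: sum over q of A_q . Lambda_q / (2 - alpha^2)^q."""
    total = 0.0
    for q in range(height + 1):
        a_q = [counts.get((k, q), 0) for k in range(q, height + 1)]
        lam_q = [alpha**k for k in range(q, height + 1)]
        total += sum(x * y for x, y in zip(a_q, lam_q)) / (2 - alpha**2) ** q
    return (1 - alpha) / n * total


# Assumed example: root 0; arcs 1->0, 2->0, 4->2, and a bidirectional arc between 1 and 3.
arcs = [(1, 0), (2, 0), (3, 1), (1, 3), (4, 2)]
# Vertices 1 and 3 meet the bidirectional arc on their way to the root (q = 1); 0, 2, 4 do not.
counts = {(0, 0): 1, (1, 0): 1, (1, 1): 1, (2, 0): 1, (2, 1): 1}
iterated = pagerank(arcs, 5)[0]
by_levels = root_by_levels(counts, 5)
by_vectors = root_by_vectors(counts, 5, height=2)
```

On this example the three values coincide up to rounding. Only the case $od(r) = 0$ is sketched here.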



8.1 Case of s-cycles 

In this section we generalise the computation of PageRank to bidirectional trees of
height $h \geq 1$ on which we close permissible cycles of any length obtained by joining
vertices from level $N_j$ with vertices from level $N_k$, for $0 \leq j < k \leq h$. In this way we
can transform bidirectional arcs $vuv$ into cycles $v u v_n \cdots v_1 v$ of longer length, where
the arc $u v_n$ closes the new cycle inserted in the rooted tree. Also the arc $uv$ of the
bidirectional arc $vuv$ can be substituted by a new arc $ut$ closing a larger path $t \ldots vu$
in the tree. In Figure 2 we exhibit some examples of these transformations.

Figure 2: Examples of cyclical trees.



Let us call the digraphs obtained by closing cycles on bidirectional trees
cyclical trees. Formally we define a digraph $C^r = (V, A)$ as a cyclical tree with
root $r$, if its set of arcs $A$ can be partitioned in two disjoint sets $A_1$ and $A_2$ such that:

• $(V, A_1)$ is a partial tree with root $r$ (the underlying tree of $C$), and

• if $uv \in A_2$ then there is a path $v_1 v_2 \cdots v_{s-1} v_s$, beginning at $v_1 = v$, ending at
$v_s = u$ and with intermediate vertices and arcs $v_i v_{i+1}$ in $A_1$, and in this case we
say that $v$ is the origin of the cycle $v v_2 \cdots v_{s-1} u v$.

We proceed to compute the PageRank of these cyclical trees. Similarly to the
bidirectional case (which is no other than a 2-cycle), we have that each cycle $uv \ldots u$
of length $l \geq 2$ and $od(u) = 2$ produces an infinite number of walks: $u$, $uv \ldots u$,
$uv \ldots uv \ldots u$, ..., with branching factors $D(u) = 1$, $D(uv \ldots u) = 1/2$,
$D(uv \ldots uv \ldots u) = 1/2^2$, ...; hence, summing over all these walks we get

$$\sum_{p:\,u \to u} \alpha^{l(p)} D(p) = \frac{1}{1-\alpha^l/2}.$$

Therefore, if the path $p : v \to a$ contains $q$ vertices, meeting $q$ cycles of length $l_1$,
$l_2$, ..., $l_q$, respectively, then the contribution to $\mathcal{P}(a)$ of the possible walks produced
on $p$ is

$$\alpha^{l(p)} D(p) \cdot \frac{1}{1-\alpha^{l_1}/2} \cdot \frac{1}{1-\alpha^{l_2}/2} \cdots \frac{1}{1-\alpha^{l_q}/2}.$$

If the cycle is $rv \ldots r$, with $od(r) = 1$, and hence $D(rv \ldots r) = 1$, we get that the
contribution to $\mathcal{P}(a)$ is

$$\frac{1}{1-\alpha^l}.$$

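Theorem 8.4 Let $C^r = (V, A)$ be a cyclical tree rooted at $r$.

(1) If $od(r) = 0$, then for all $a \in V$,

$$\mathcal{P}(a) = \frac{1-\alpha}{N} \sum_{p:\,v \to a} \frac{\alpha^{l(p)}}{2^n (1-\alpha^{l_1}/2) \cdots (1-\alpha^{l_q}/2)}.$$

(2) If $od(r) = 1$ in the cycle $r v_1 \ldots v_{l-1} r$, then

$$\mathcal{P}(a) = \frac{1-\alpha}{N} \sum_{p:\,v \to a} \frac{\alpha^{l(p)}}{2^n (1-\alpha^{l_1}/2) \cdots (1-\alpha^{l_q}/2)}, \quad a \notin \{r, v_1, \ldots, v_{l-1}\},$$

and

$$\mathcal{P}(a) = \frac{1-\alpha}{N} \sum_{p:\,v \to a} \frac{\alpha^{l(p)}}{2^n (1-\alpha^{l_1}/2) \cdots (1-\alpha^{l_{q-1}}/2)(1-\alpha^l)}, \quad a = r, v_1, \ldots, v_{l-1},$$

where in all cases, $p : v \to a$ is the unique path from $v$ to $a$, and $l(p)$ is the length of
this path; $n$ is the number of bidirectional vertices (i.e. with $od(u) = 2$) not being an
end-vertex in $p$; $q$ is the number of cycles meeting $p$, of lengths $l_1, l_2, \ldots, l_q$. □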

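In particular, if $od(r) = 0$, $n = q$, and

$$\mathcal{P}(r) = \frac{1-\alpha}{N} \sum_{p:\,v \to r} \frac{\alpha^{l(p)}}{(2-\alpha^{l_1}) \cdots (2-\alpha^{l_q})}.$$

And if $od(r) = 1$, $n = q - 1$, and

$$\mathcal{P}(r) = \frac{1-\alpha}{N} \sum_{p:\,v \to r} \frac{\alpha^{l(p)}}{(2-\alpha^{l_1}) \cdots (2-\alpha^{l_{q-1}})(1-\alpha^l)}.$$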
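As an illustration of the case $od(r) = 0$ for cyclical trees, the following Python sketch evaluates the formula for the root on a tree with one 3-cycle and compares it with a fixed-point computation. It assumes the convention $\mathcal{P}(x) = \frac{1-\alpha}{N} + \alpha \sum_{y \to x} \mathcal{P}(y)/od(y)$ without redistribution from vertices of out-degree 0; the names `pagerank` and `root_cyclical` and the example digraph are hypothetical.

```python
def pagerank(arcs, n, alpha=0.85, iters=500):
    """Fixed-point iteration of P(x) = (1 - alpha)/n + alpha * sum_{y -> x} P(y)/od(y)."""
    out = [[] for _ in range(n)]
    for u, v in arcs:
        out[u].append(v)
    base = (1 - alpha) / n
    rank = [base] * n
    for _ in range(iters):
        nxt = [base] * n
        for u in range(n):
            for v in out[u]:
                nxt[v] += alpha * rank[u] / len(out[u])
        rank = nxt
    return rank


def root_cyclical(paths, n, alpha=0.85):
    """Case od(r) = 0: one entry (l(p), [l_1, ..., l_q]) per vertex, for its path p to the root."""
    total = 0.0
    for length, cycle_lengths in paths:
        term = alpha**length
        for l in cycle_lengths:
            term /= 2 - alpha**l      # each cycle met by the path contributes 1/(2 - alpha^l)
        total += term
    return (1 - alpha) / n * total


# Assumed example: underlying tree 3 -> 2 -> 1 -> 0 and 4 -> 0, plus the arc 1 -> 3,
# which closes the cycle 1 3 2 1 of length 3 (so od(1) = 2).
arcs = [(1, 0), (4, 0), (2, 1), (3, 2), (1, 3)]
paths = [(0, []), (1, [3]), (2, [3]), (3, [3]), (1, [])]   # vertices 0, 1, 2, 3, 4
iterated = pagerank(arcs, 5)[0]
closed = root_cyclical(paths, 5)
```

With these inputs `closed` and `iterated` agree up to rounding; replacing the 3-cycle by a bidirectional arc amounts to setting every cycle length to 2.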
9 Properties of bidirectional and cyclical trees



Analogously to the case of unidirectional trees, we shall analyse in this section the
behaviour of PageRank on bidirectional and, more generally, cyclical trees when their
topology is modified. Our first result shows that on a unidirectional tree changing
a unidirectional arc to bidirectional enhances the PageRank value of the end-vertices of
the transformed arc, but reduces the PageRank of the root of the tree.

Theorem 9.1 If in a unidirectional tree $T^r$ an arc $vu$, with $u \neq r$, is changed to a
bidirectional arc $uvu$, then $\mathcal{P}(u)$ and $\mathcal{P}(v)$ both increase, but $\mathcal{P}(r)$ decreases.

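Proof: We introduce some notation first. $\mathcal{P}_x(T^y)$ denotes the PageRank of vertex $x$
in the tree $T^y$ with root $y$; $n_p(T^y)$ denotes the number of vertices at level $N_p$ in the
tree $T^y$, with $n_p(T^y) = 0$ for $p < 0$. Now, assume that $u$ is at level $N_k$ in the tree $T^r$
(and, hence, $v \in N_{k+1}$). Then, we have that

$$\mathcal{P}_r(T^r) = \frac{1-\alpha}{N} \sum_{p=0}^{h} n_p(T^r) \alpha^p
= \frac{1-\alpha}{N} \left( \sum_{p=0}^{h} \bigl( n_p(T^r) - n_{p-k}(T^u) \bigr) \alpha^p + \sum_{p=k}^{h} n_{p-k}(T^u) \alpha^p \right),$$

and, therefore, if $B^r$ is the bidirectional tree obtained from $T^r$ by just adding the
bidirectional arc $uvu$, we have

$$\mathcal{P}_r(B^r) = \frac{1-\alpha}{N} \left( \sum_{p=0}^{h} \bigl( n_p(T^r) - n_{p-k}(T^u) \bigr) \alpha^p + \frac{1}{2-\alpha^2} \sum_{p=k}^{h} n_{p-k}(T^u) \alpha^p \right),$$

which shows that the PageRank of the root $r$ decreases. On the other hand, the
PageRanks of $u$ and $v$ are given by the equations:

$$\mathcal{P}_u(B^r) = \frac{1-\alpha}{N(1-\alpha^2/2)} \sum_{p=k}^{h} n_{p-k}(T^u) \alpha^{p-k}$$

and

$$\mathcal{P}_v(B^r) = \frac{1-\alpha}{N(1-\alpha^2/2)} \left[ \frac{\alpha}{2} \sum_{p=k}^{h} \bigl( n_{p-k}(T^u) - n_{p-k-1}(T^v) \bigr) \alpha^{p-k} + \sum_{p=k+1}^{h} n_{p-k-1}(T^v) \alpha^{p-k-1} \right].$$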
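The statement of Theorem 9.1 can be observed on a small example. The Python sketch below promotes one arc of a unidirectional tree to a bidirectional arc and compares the PageRank values before and after. It assumes the convention $\mathcal{P}(x) = \frac{1-\alpha}{N} + \alpha \sum_{y \to x} \mathcal{P}(y)/od(y)$ without redistribution from vertices of out-degree 0; the helper name `pagerank` and the six-vertex tree are hypothetical.

```python
def pagerank(arcs, n, alpha=0.85, iters=500):
    """Fixed-point iteration of P(x) = (1 - alpha)/n + alpha * sum_{y -> x} P(y)/od(y)."""
    out = [[] for _ in range(n)]
    for u, v in arcs:
        out[u].append(v)
    base = (1 - alpha) / n
    rank = [base] * n
    for _ in range(iters):
        nxt = [base] * n
        for u in range(n):
            for v in out[u]:
                nxt[v] += alpha * rank[u] / len(out[u])
        rank = nxt
    return rank


alpha = 0.85
# Assumed unidirectional tree with root r = 0; the arc vu = (3, 1) will be promoted.
tree = [(1, 0), (2, 0), (3, 1), (4, 1), (5, 3)]
r, u, v = 0, 1, 3
before = pagerank(tree, 6, alpha)
after = pagerank(tree + [(u, v)], 6, alpha)   # adding u -> v turns vu into the bidirectional arc uvu

u_gain = after[u] / before[u]                 # the loop uvu gives the factor 1 / (1 - alpha^2 / 2)
root_drop = before[r] - after[r]              # positive: u now passes only half of its value upwards
```

In this example `after[u]` and `after[v]` exceed their previous values while `after[r]` is smaller, and the vertices outside the path from $u$ to the root keep their PageRank.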



Using the same arguments as given for the previous theorem, we can generalise the
result to the case where the original tree is bidirectional, and one of its unidirectional
arcs (if any) is promoted to being bidirectional.

Theorem 9.2 Let $B^r$ be a bidirectional tree, and let $B'^r$ be the tree resulting from $B^r$
when one of its arcs $vu$, with $u \neq r$, is transformed into a bidirectional arc $uvu$. Then

1. $\mathcal{P}_u(B'^r) = \dfrac{\mathcal{P}_u(B^r)}{1-\alpha^2/2} > \mathcal{P}_u(B^r)$.

2. $\mathcal{P}_v(B'^r) > \mathcal{P}_v(B^r)$.

3. If $u'v'u'$ is a bidirectional arc intersecting the path $u v_1 \ldots v_k = r$, then
$\mathcal{P}_{u'}(B'^r) < \mathcal{P}_{u'}(B^r)$ and $\mathcal{P}_{v'}(B'^r) < \mathcal{P}_{v'}(B^r)$.

4. $\mathcal{P}_x(B'^r) < \mathcal{P}_x(B^r)$ for every vertex $x$ in the path $v_1 \ldots v_k = r$.

5. In particular, $\mathcal{P}_r(B'^r) < \mathcal{P}_r(B^r)$.

6. The vertices which are neither contained in the path $u v_1 \ldots v_k = r$ nor in the
bidirectional arcs intersecting this path preserve their original PageRank. □



Theorems 9.1 and 9.2 suggest that in order to increase the PageRank of the root
$r$ of a tree we have to directly promote to bidirectional the arcs incident to $r$. The
consequences of this manipulation are summarised in the following theorem, which is a
direct consequence of the two previous results.

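Theorem 9.3 Let $B^r$ be a bidirectional tree, with $od(r) = 0$, and let $B'^r$ be the tree
resulting from $B^r$ when one of its arcs $vr$ is transformed into a bidirectional arc $rvr$.
Then

1. $\mathcal{P}_r(B'^r) = \dfrac{\mathcal{P}_r(B^r)}{1-\alpha^2}$. (Note that for $\alpha = 0.85$ this gives $\mathcal{P}_r(B'^r) \approx 3.6\,\mathcal{P}_r(B^r)$.)

2. $\mathcal{P}_v(B'^r) = \mathcal{P}_v(B^r) + \dfrac{\alpha\,\mathcal{P}_r(B^r)}{1-\alpha^2}$.

3. $\mathcal{P}_r(B'^r) > \mathcal{P}_v(B'^r) \iff \mathcal{P}_r(B^r) > (1+\alpha)\,\mathcal{P}_v(B^r)$.

4. All other vertices (different from $r$ and $v$) preserve their PageRank. □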
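The four items of Theorem 9.3 can be verified numerically on a small bidirectional tree. The sketch below assumes the convention $\mathcal{P}(x) = \frac{1-\alpha}{N} + \alpha \sum_{y \to x} \mathcal{P}(y)/od(y)$ without redistribution from vertices of out-degree 0; the helper name `pagerank` and the five-vertex example are hypothetical.

```python
def pagerank(arcs, n, alpha=0.85, iters=500):
    """Fixed-point iteration of P(x) = (1 - alpha)/n + alpha * sum_{y -> x} P(y)/od(y)."""
    out = [[] for _ in range(n)]
    for u, v in arcs:
        out[u].append(v)
    base = (1 - alpha) / n
    rank = [base] * n
    for _ in range(iters):
        nxt = [base] * n
        for u in range(n):
            for v in out[u]:
                nxt[v] += alpha * rank[u] / len(out[u])
        rank = nxt
    return rank


alpha = 0.85
# Assumed bidirectional tree with root r = 0 and od(r) = 0: a bidirectional arc between
# 1 and 3, and plain arcs 1 -> 0, 2 -> 0, 4 -> 2. The arc vr = (2, 0) is promoted.
arcs = [(1, 0), (2, 0), (3, 1), (1, 3), (4, 2)]
r, v = 0, 2
before = pagerank(arcs, 5, alpha)
after = pagerank(arcs + [(r, v)], 5, alpha)   # adding r -> v creates the bidirectional arc rvr

root_factor = after[r] / before[r]            # item 1 predicts 1 / (1 - alpha^2), about 3.6
v_shift = after[v] - before[v]                # item 2 predicts alpha * P_r(B^r) / (1 - alpha^2)
```

For $\alpha = 0.85$ the factor $1/(1-\alpha^2) = 1/0.2775$ is approximately 3.6, in agreement with the note in item 1.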

For cyclical trees we have results similar to Theorems 9.1–9.3, but factoring out by
$1/(1-\alpha^l)$ in place of $1/(1-\alpha^2)$.

Now, the pruning of the lower levels of a bidirectional tree has mixed consequences for
the PageRank of the root, as opposed to the positive results obtained for unidirectional
trees in section 5. We illustrate the possible outcomes of pruning the lower levels of a
bidirectional tree in the figures below.

In the tree shown in Figure 3, for $n \leq 75$ and for all $m \geq 1$, successive removal of
the $m$ vertices of the last level increments the PageRank of the root, $\mathcal{P}(1)$. For $n \geq 76$
and for all $m \geq 1$, successive removal of the $m$ vertices of the last level decrements
$\mathcal{P}(1)$. On the other hand, in the tree shown in Figure 4, for $n \leq 31$ and for all $m \geq 1$,
successive removal of the $m$ vertices of the last level increments $\mathcal{P}(1)$. For $n \geq 32$ and
for all $m \geq 1$, successive removal of the $m$ vertices of the last level decrements $\mathcal{P}(1)$.

Figure 3: case $od(r) = 0$. Figure 4: case $od(r) = 1$.



The previous results give us some clues on ways of optimising the PageRank of tree-like
organised sites. Obviously these rules for rearrangement should apply insofar as the
context allows.¹

Rule 1 To augment the PageRank of the root, transform incoming arcs into bidirectional arcs.
Furthermore, link the root with vertices below in the tree (so that cycles passing
through the root are built).

Rule 2 To augment the PageRank of a vertex $u$ different from the root, link $u$ with
a bidirectional arc to each one of the vertices on the subtree with root $u$ (hence
obtaining a cyclical tree). Keep in mind that this enhances the PageRank of $u$
but reduces the PageRank of the root. One may interpret this action as linking
an individual with all its subordinates in a hierarchical organisation.

10 More complex topologies 

The next natural step is to upgrade the preceding results on bidirectional and cyclical
trees to finite cyclic structures which can be modelled by our infinite trees. In order to
achieve these further extensions we should visualise an arbitrary digraph through
its condensation digraph, that is, as the acyclic digraph consisting of its strongly connected
components.

A digraph $\mathcal{D} = (V, A)$ is strongly connected if for each pair $u$ and $v$ of distinct
vertices, there is a path joining $u$ with $v$ and a path joining $v$ with $u$. Define $u \equiv v$
provided there are paths joining $u$ with $v$ and $v$ with $u$. This is an equivalence
relation and, in consequence, $V$ is partitioned into equivalence classes $V_1, \ldots, V_p$. The
$p$ subdigraphs $\mathcal{D}_i = (V_i, A/V_i)$ induced on the sets $V_i$, $i = 1, \ldots, p$, are the strongly
connected components of $\mathcal{D}$. The digraph $\mathcal{D}$ is strongly connected if and only if it
has exactly one strong component. The condensation digraph of the digraph $\mathcal{D}$ is
the acyclic digraph whose vertices are the strongly connected components, or SCC, of $\mathcal{D}$,
and there is an arc from one SCC $\mathcal{D}_i$ to another $\mathcal{D}_j$, $i \neq j$, if a vertex of $\mathcal{D}_i$ is adjacent
in $\mathcal{D}$ to a vertex of $\mathcal{D}_j$.
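The definitions above translate directly into code. The following Python sketch computes the strongly connected components with Kosaraju's two-pass depth-first search (one standard choice, not a method taken from this paper) and then builds the arcs of the condensation digraph. The function names and the five-vertex example are illustrative assumptions, and the recursive search is meant for small digraphs only.

```python
def strongly_connected_components(vertices, arcs):
    """Map each vertex to the label of its SCC, using Kosaraju's two-pass search."""
    out = {v: [] for v in vertices}
    rev = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
        rev[v].append(u)

    order, seen = [], set()

    def visit(v):                      # first pass: record vertices in post-order
        seen.add(v)
        for w in out[v]:
            if w not in seen:
                visit(w)
        order.append(v)

    for v in vertices:
        if v not in seen:
            visit(v)

    comp = {}

    def assign(v, label):              # second pass: search the reversed digraph
        comp[v] = label
        for w in rev[v]:
            if w not in comp:
                assign(w, label)

    label = 0
    for v in reversed(order):
        if v not in comp:
            assign(v, label)
            label += 1
    return comp


def condensation(vertices, arcs):
    """Return the SCC labelling and the arc set of the condensation digraph."""
    comp = strongly_connected_components(vertices, arcs)
    cond_arcs = {(comp[u], comp[v]) for u, v in arcs if comp[u] != comp[v]}
    return comp, cond_arcs


# Assumed example: a 2-cycle {1, 2}, a 3-cycle {3, 4, 5}, and a single arc 3 -> 1 between them.
vertices = [1, 2, 3, 4, 5]
arcs = [(1, 2), (2, 1), (3, 4), (4, 5), (5, 3), (3, 1)]
comp, cond_arcs = condensation(vertices, arcs)
```

Here the condensation has two vertices joined by one arc, from the component of vertex 3 to the component of vertex 1.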

Now, the extension of our techniques and procedures to a more general digraph
requires that its corresponding condensation digraph be a rooted tree whose SCC
behave like the cyclical rooted trees. Not all SCC may have the required behaviour,
but many do so, and the key is that each SCC should have a root through which it
connects to the rest of the tree (and by no other vertex), and two such SCC which are
adjacent in the condensation digraph are linked by just one arc in the original digraph.
The SCC which have these properties we shall call PR-digraphs.

¹ We are aware that some of the rules listed here (and more that could be derived from our results)
are, to some extent, already in use by web masters and SEO analysts but, as far as we have seen,
without much mathematical justification.

Figure 5: Digraph $\mathcal{D}$ and the PR-digraphs that can be derived from it.

Formally, a PR-digraph with root $r$ is a strongly connected digraph with at
least one vertex $r$ (the root of the PR-digraph) such that for every vertex $v$ ($v \neq r$)
there is a unique path joining $v$ with $r$. In this structure we say that a vertex
$v$ is at level $N_k$ if the path that connects $v$ with $r$ is of length $k$. Now, in essence, a
PR-digraph is much like a cyclical tree inasmuch as it can be seen as a tree with
cycles formed by adding arcs from one level up to another level down. The key is that
a PR-digraph admits a corresponding infinite tree due to the fact that each vertex has
been assigned a unique level of the graph or, in other words, a unique path to $r$. Note
that the root, like the rest of the vertices, may have out-degree greater than or equal to 1. As
an illustration of structures that can be PR-digraphs or not, see Figure 5. There, in
the strongly connected digraph $\mathcal{D}$ the vertex 1 cannot be the root of a PR-digraph
since vertex 2 has two paths towards 1. On the contrary, the vertices 2 and 3 can be
roots of a PR-digraph $\mathcal{D}$.

Also the complete digraph $\{(1,2), (2,1), (1,3), (3,1), (2,3), (3,2)\}$ cannot be a
PR-digraph for any of its vertices. But the strongly connected digraph
$\{(1,2), (2,1), (2,3), (3,2)\}$ is a PR-digraph rooted at any of its three vertices.
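The two small examples above can be checked mechanically. The Python sketch below lists the vertices that can serve as root of a PR-digraph, reading "path" as a simple path (no repeated vertex), which is an assumption about the intended definition. The function names are hypothetical, and the exhaustive path count is only practical for small digraphs.

```python
def count_simple_paths(out, start, goal, visited=frozenset()):
    """Number of simple paths (no repeated vertex) from `start` to `goal`."""
    if start == goal:
        return 1
    visited = visited | {start}
    return sum(count_simple_paths(out, w, goal, visited)
               for w in out[start] if w not in visited)


def pr_digraph_roots(vertices, arcs):
    """Vertices r such that every other vertex has exactly one simple path to r and is reachable from r."""
    out = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
    roots = set()
    for r in vertices:
        others = [v for v in vertices if v != r]
        unique_in = all(count_simple_paths(out, v, r) == 1 for v in others)
        # Together with unique_in, reaching every vertex from r gives strong connectivity.
        reaches_all = all(count_simple_paths(out, r, v) >= 1 for v in others)
        if unique_in and reaches_all:
            roots.add(r)
    return roots


complete = [(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)]
chain = [(1, 2), (2, 1), (2, 3), (3, 2)]
roots_complete = pr_digraph_roots([1, 2, 3], complete)
roots_chain = pr_digraph_roots([1, 2, 3], chain)
```

In the complete digraph every vertex has two simple paths to any candidate root (the direct arc and the detour through the third vertex), so no vertex qualifies, while in the second digraph all three vertices do.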

The condition characterising a PR-digraph must also apply to the connections
among SCC which are PR-digraphs. It cannot be the case that in a tree of SCCs,
which are PR-digraphs, one such SCC connects to another SCC in the tree through
two arcs or more; that is, in the original digraph there must be a root (which itself
could be the root of a PR-digraph) and it must be the case that each vertex $v$ connects
to the root by a unique path. On the other hand, we must admit the possibility of
producing cycles of length $s > 1$ in this structure by connecting a vertex at the level
$N_{k+1-s}$ with a vertex at level $N_k$.

We shall then define a PR-digraph tree with root $r$ as a digraph $\mathcal{D} = (V, A)$
whose set of arcs $A$ can be partitioned in two disjoint sets $A_1$ and $A_2$ such that:

• $(V, A_1)$ is a partial digraph whose condensation digraph is a tree of SCCs which
are PR-digraphs, each pair of adjacent PR-digraphs is linked by a unique arc
and the maximal PR-digraph contains the root $r$ (the underlying digraph of $\mathcal{D}$);
and

• if $uv \in A_2$ then there is a path $v_1 v_2 \cdots v_{s-1} v_s$, beginning at $v = v_1$, ending at
$u = v_s$ and with intermediate vertices and arcs $v_j v_{j+1}$ in $A_1$, and in this case we
say that $v$ is the origin of the cycle $v v_2 \cdots v_{s-1} u v$.

Figure 6: Digraph which is not a PR-digraph tree.

Note that this time the arc $uv$, as well as the cycle $v v_2 \cdots v_{s-1} u v$, could be in the partial
digraph $(V, A_1)$. We show in Figure 6 a digraph $\mathcal{D}$ that cannot be a PR-digraph tree.
This is due to the fact that the left-side SCC of $\mathcal{D}$ is not compatible with a PR-digraph
tree, since the vertex $u$ has out-degree 2 towards the root $r$. Also, in the SCC on the
right branch of $\mathcal{D}$ either one of the arcs $xy$ or $zw$ represents a surplus that forbids $\mathcal{D}$
from being a PR-digraph tree.

Theorem 10.1 A PR-digraph tree with root r is a cyclical tree with root r. 

Proof: Let $\mathcal{D} = (V, A)$ be a PR-digraph tree with root $r$. It is sufficient to prove that
the underlying digraph, $C = (V, A_1)$, of $\mathcal{D}$ is a cyclical tree with root $r$. The set of arcs
$A_1$ can be partitioned in two disjoint sets: $A_{11} = \{vu : \text{there is a path } vu \ldots r \text{ in } C\}$
and $A_{12} = A_1 \setminus A_{11}$. The digraph $(V, A_{11})$ is a tree rooted at $r$ because, by the definition
of the underlying digraph $C$, each vertex of $V$ is joined with the root $r$ by a unique
path. Then $(V, A_{11})$ is the underlying directed tree of $C$ and, moreover, if $uv$ belongs
to $A_{12}$ then $uv$ is in an SCC that is a PR-digraph with some root $r'$. By the strong
connection, there is a path joining $v$ to $u$, and thus $C$ is a cyclical tree. □

As a consequence of this theorem we can compute the PageRank of the root $r$ of
a PR-digraph tree by a formula similar to the one given in section 8.1 for cyclical trees. In
Figure 7 we exhibit a PR-digraph tree $\mathcal{D}$ and its representation as a cyclical tree.

The PR-digraph trees are the most general cyclical structures which can be
interpreted as unidirectional infinite trees, and on which we can apply the optimisation
techniques displayed in this article by treating each SCC as one unit. This could also
result in a speed-up of the PageRank calculation. More explicitly, the last point we
want to call attention to is the following. There are several approaches in the
literature to the task of speeding up the calculation of PageRank, based upon the following
general scheme (see, for example, [TT|, [2], [6]):

Partition the web into local subwebs; then compute some independent
ranking for each local subweb, which will apply to the whole subweb treated as
a unit; and then compute the ranking of the graph of subwebs.

Figure 7: PR-digraph tree and associated cyclical tree.

In [2] and [6] the local splitting of the web is done in strongly connected
components, and further in [6, Thm 2.1] it is shown that the PageRank can be calculated
independently on each SCC, provided we know the PageRank of all vertices outside
the SCC but directly linking to vertices in the SCC. Our PR-digraph tree is the
simplest splitting of the web in the way of [2] and [6], namely as SCC, with the
additional stronger condition of having a single link between components. By the
previously mentioned result of [6], it can have its PageRank computed independently on each
SCC, and in a very simple way, provided we know the PageRank of their descendants
in the topological structure of the tree. This suggests computing PageRank in parallel
and through layers, as it is proposed in (BJ §3], following an iterated process on the tree
from a top level down to the root at $N_0$. The PR-digraph is a suitable structure
for the application of this process.
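A layered computation of this kind can be sketched as follows. The Python code below processes a given list of components one at a time, treating the PageRank flowing in from already processed components as fixed, and compares the outcome with a global computation. It assumes the convention $\mathcal{P}(x) = \frac{1-\alpha}{N} + \alpha \sum_{y \to x} \mathcal{P}(y)/od(y)$ without redistribution from vertices of out-degree 0, and it assumes that the caller supplies the components in an order where every arc between two components leaves the earlier one. The function names and the example are hypothetical and are not the procedure of the cited works.

```python
def pagerank(arcs, n, alpha=0.85, iters=500):
    """Global fixed-point iteration of P(x) = (1 - alpha)/n + alpha * sum_{y -> x} P(y)/od(y)."""
    out = [[] for _ in range(n)]
    for u, v in arcs:
        out[u].append(v)
    base = (1 - alpha) / n
    rank = [base] * n
    for _ in range(iters):
        nxt = [base] * n
        for u in range(n):
            for v in out[u]:
                nxt[v] += alpha * rank[u] / len(out[u])
        rank = nxt
    return rank


def layered_pagerank(arcs, n, layers, alpha=0.85, iters=500):
    """Solve component by component; `layers` must list descendants before the components they link to."""
    out_deg = [0] * n
    incoming = [[] for _ in range(n)]
    for u, v in arcs:
        out_deg[u] += 1
        incoming[v].append(u)
    base = (1 - alpha) / n
    rank = [None] * n                  # a wrongly ordered `layers` fails on a None entry
    for comp in layers:
        members = set(comp)
        # Inflow from vertices outside the component, already ranked and therefore fixed.
        external = {v: sum(alpha * rank[u] / out_deg[u]
                           for u in incoming[v] if u not in members) for v in comp}
        local = {v: base for v in comp}
        for _ in range(iters):         # local iteration restricted to the component
            local = {v: base + external[v]
                        + sum(alpha * local[u] / out_deg[u]
                              for u in incoming[v] if u in members) for v in comp}
        for v in comp:
            rank[v] = local[v]
    return rank


# Assumed example: component {2, 3, 4} (a 3-cycle) linked by the single arc 2 -> 0
# to the component {0, 1} (a 2-cycle) that plays the role of the root component.
arcs = [(0, 1), (1, 0), (2, 3), (3, 4), (4, 2), (2, 0)]
layered = layered_pagerank(arcs, 5, layers=[[2, 3, 4], [0, 1]])
direct = pagerank(arcs, 5)
```

Both computations give the same values up to rounding. Components that lie in different branches of the tree do not depend on each other, so they could be processed in parallel.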